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ABSTRACT 

The  multi-backend  database  system(MBDS)  is  a  research  effort  conducted 
jointly  by  the  Naval  Postgraduate  School  and  Ohio  State  University  with  the 
sponsorship  of  the  STARS  foundation.  The  MBDS  is  designed  to  overcome  the 
capacity  growth  and  performance  gain  problems  of  the  traditional  database  sys- 
tems and  the  single-backend  database  systems. 

The  original  MBDS  supported  four  primary  operations  -  INSERT. 
RETRIEVE.  DELETE,  and  UPDATE.  This  thesis  presents  the  implementation 
of  a  fifth  primary  operation.  RETRIEVE-COMMON.  This  operation  is  used  to 
merge  the  records  of  two  files  which  satisfy  a  particular  query  and  share  a  com- 
mon value  for  given  attributes.  The  preliminary  design  is  discussed  in  the  Naval 
Postgraduate  Thesis  "Design,  Analysis  and  Implementation  of  the  Primary 
Operation.  Retrieve-Common,  of  the  Multi-Backend  Database  System  (MBDS)" 
by  Hsiang-Lung  Tung. 


TABLE  OF  CONTENTS 

I.  AN    INTRODUCTION  8 

A.  THE   DESIGN  REQUIREMENTS 11 

B.  THE  PROCESS  STRUCTURE 12 

1.  The  Processes  of  the  Controller 12 

2.  The  Processes  of  the  Backends  14 

C.  THE  ORGANIZATION  OF  THE  THESIS 15 

II.  THE  PRIMARY  OPERATIONS  OF  MBDS  16 

A.  THE  ATTRIBUTE-BASED  DATA  MODEL 16 

B.  THE  DIRECTORY  STRUCTURES 17 

C.  THE  DATA  MANIPULATION  OPERATIONS  17 

D.  THE  RETRIEVE-COMMON  OPERATION  19 

1.  The  Syntax  of  the  Retrieve-Common  Operation   '. 19 

2.  An  Overview  of  the  Processing  21 

III.  MODIFICATIONS  REQUIRED  TO  SPECIFICATIONS  22 

A.  THE  EXECUTION  OF  THE  RETRIEVE-COMMON  REQUEST 22 

B.  THE  DEVIATIONS  FROM  THE  SPECIFICATIONS 27 

1.  The  Test  Interface  Process 27 

2.  The  Request  Preparation  Process   28 

3.  The  Concurrency  Control  Process  29 

4.  The  Parallel  Communications  Link  Process  29 

5.  The  Directory  Management  Process 29 

6.  The  Record  Processing  Process  29 

5 


IV.  THE  DESIGN  OF  DATA  STRUCTURES  FOR  THE  IMPLEMENTATION  32 

A.  THE  RPRID  INFO  MODIFICATIONS  32 

B.  THE  HASHLNGINFO  STRUCTURE  34 

C.  THE  HASHRESULT  STRUCTURE  37 

D.  THE  BLOCK  STRUCTURE 38 

V.  THE  TESTING 39 

A.  THE  TESTING  PROCESS  39 

B.  THE  DETAILS  OF  THE  TESTING  PROCESS  41 

VI.  THE  CONCLUSION 44 

APPENDIX  A  -  APPENDIX  A:   THE  TESTING  RESULTS 47 

APPENDIX  B  -  APPENDIX  B:   A  WALK  THROUGH  THE  USER  INTERFACE  ..  53 

LIST  OF  REFERENCES 63 

INITIAL  DISTRIBUTION  LIST  64 


ACKNOWLEDGEMENT 

This  thesis  is  part  of  ongoing  database  systems  research  efforts  being  conducted  at 
the  Laboratory  for  Database  Systems  Research  at  the  Naval  Postgraduate  School, 
Monterey.  Ca  93943,  under  the  direction  of  Dr.  David  K.  Hsiao.  This  research  is 
supported  by  grants  from  the  Department  of  Defense  STARS  Program,  and  from  the 
Office  of  Naval  Research. 

I  would  like  to  express  my  sincere  appreciation  to  the  following  individuals: 

Dr.  David  K.  Hsiao  for  his  guidance  and  wisdom  in  leading  me  through  this  effort. 

Steve  Demurjian.  a  PhD  student  in  Computer  Science,  who  is  currently  serving  as  a 
research  assistant  at  the  Naval  Postgraduate  School.  He  provided  technical  support 
during  all  phases  of  this  thesis  and  proved  to  be  an  invaluable  asset  to  the  MBDS 
project. 

Finally,  a  special  thanks  goes  to  my  wife.  Debby.  for  her  typing  support  and,  more 
importantly,  her  patience  and  understanding  during  the  past  nine  months. 


I.   AN  INTRODUCTION 

Database  computers  must  be  capable  of  performing  operations  on  data  in  a 
timely  and  efficient  manner.  Applications  require  data  to  be  stored  on  auxiliary 
storage  devices  and  only  through  an  efficient  organization  of  software  and 
hardware  can  the  database  operations  be  performed  quickly.  The  multi- 
backend  database  system  (MBDS)  at  the  Laboratory  for  Database  Systems 
Research  at  the  Naval  Postgraduate  School  in  Monterey,  California  is  one  such 
database  computer.    [Ref.  l]. 
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Figure  1.1    The  Multi-Backend  Database  System  (MBDS) 
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MBDS  consists  of  one  controller  and  one  or  more  backend  computers.  As 
can  be  seen  in  Figure  1.1.  requests  to  access  the  database  are  entered  in  MBDS 
from  either  a  host  computer  or  a  terminal.  The  controller  receives  the  requests, 
checks  their  validity,  and  broadcasts  the  requests  to  all  of  the  backends.  The 
backends  are  independent  computers  and  each  of  them  has  a  dedicated  disk 
system.  The  database  records  are  evenly  distributed  across  the  backends  with 
each  record  being  stored  in  only  one  backend's  disk  drive.  A  backend  can  begin 
processing  a  request  the  moment  it  is  received.  This  allows  parallel  processing  of 
the  request.  Since  the  requests  can  also  be  queued  at  the  backend  allowing  for 
greater  utilization  of  resources,  it  allows  concurrent  processing  of  requests.  The 
goal  is  for  MBDS  to  provide  results  in  a  time  proportional  to  the  number  of 
backends  connected.  Thus,  two  backends  should  perform  nearly  twice  as  fast  as  a 
single  backend  when  the  database  size  remains  constant.  While  the  backends  are 
performing  the  database  operations,  the  controller  is  available  for  more  requests  to 
be  accepted  and  for  providing  the  output.  This  design  allows  the  continuous 
processing  of  requests  with  no  bottlenecks. 

The   controller   and   backends   communicate   using   an    Ethernet.     The   data 

placed   on   this   bus   is   in   a   message   format   which   includes   the   sender  of  the 

message,  the  receiver  of  the  message,  the  type  of  message,  and  if  appropriate,  a 

message   body.     Each  computer  connected   to  the   Ethernet  has  routines  which 

allow  the  computer  to  place  messages  on  the  bus  via  the  Put-NET  process  and 

extract  messages  from  the  bus  via  the  Get-NET  process. 
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Originally.  MBDS  had  four  primary  operations  -  INSERT.  DELETE. 
UPDATE,  and  RETRIEVE.  This  thesis  presents  the  implementation  of  the  fifth 
primary  operation.  RETRIEVE-COMMON.  The  design  and  analysis  of  this 
operation  is  performed  by  Hsiang-Lung  Tung  in  his  NPS  Thesis  "Design. 
Analysis,  and  Implementation  of  the  Primary  Operation,  Retrieve-Common,  of 
the  Multi-Backend  Database  System".  The  retrieve-common  operation,  similar  to 
a  relational  join,  allows  the  records  of  two  files  to  be  combined  if  the  records  share 
a  common  value  for  a  prescribed  category.  This  is  a  powerful  and  desirable 
function  with  many  useful  applications.  As  an  example,  suppose  a  motor  vehicle 
department  maintains  two  files,  namely,  one  for  licensed  drivers  and  the  other  for 
registered  vehicles.  In  order  to  store  as  little  redundant  data  as  possible,  the  only 
field  which  these  two  files  have  in  common  is  the  social  security  number  of  the 
driver  of  one  file  and  owner  of  the  other  file.  Using  the  retrieve-common,  it  is 
possible  to  merge  these  two  files  on  the  common  values  of  the  social  security 
number.  The  result  will  be  longer  records  with  all  the  personnel  data  of  the 
licensed  operator  connected  with  the  vehicle  data  of  the  vehicle  registered  to  the 
same  social  security  number.  Further,  from  this  information  it  is  trivial  to  trace  a 
license  plate  number  to  the  owner  of  a  vehicle  and  his  address. 

The  rest  of  this  chapter  presents  background  information  to  familiarize  the 
reader  with  MBDS.      A  brief  discussion  of  the  design  principles  which  MBDS  is 

based  on   is  presented  in  the  next  section.     The  organization  of  the  software  is 

i 

presented   and  the  relationship  between  the  existing  processes  and  this  thesis  is 
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discussed.      Finally,   we   present    an   overview   of  the   remaining  chapters   in   this 
thesis. 

A.    THE  DESIGN  REQUIREMENTS 

Three  requirements  have  been  defined  for  the  design  of  MBDS.  First.  MBDS 
must  be  easily  expandable.  There  are  two  reasons  for  expanding  a  database 
system.  First,  the  existing  hardware  may  not  be  large  enough  to  hold  the  volume 
of  records  needed  in  the  database.  Second,  the  existing  system  is  slow  in 
providing  results.  These  two  reasons  are  commonly  referred  to  as  the  capacity- 
growth  problem  and  the  performance-gain  problem,  respectively.  MBDS 
assures  expandability  by  providing  identical  software  and  hardware  to  all 
backends.  The  controller  needs  to  know  the  number  of  backends  being  used  and 
this  is  provided  by  the  user  during  the  initial  start-up. 

The  second  design  requirement  is  that  the  hardware  and  software  be  generic. 
This  allows  for  a  system  which  can  be  easily  expanded  without  regard  to  special 
hardware  features.  The  software  must  be  portable  so  that  when  a  new  backend  is 
added  the  code  can  be  transferred  from  an  existing  backend  without  regard  to 
the  make  or  model.  The  result  of  this  design  requirement  is  that  MBDS  can  be 
upgraded  by  adding  the  state-of-the-art  equipment  to  an  existing  system. 

The  third  requirement  suggests  that  the  parallelism  is  to  be  exploited.  This 
requires  that  the  records  are  to  be  evenly  distributed  across  the  backends.  Over 
the  time,  this  allows  the  workload  of  each  backend  to  be  the  same.     Further, 
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requests  are  processed  concurrently,  allowing  parallel  accesses  to  the  database.  It 
is  even  possible  for  backends  to  be  processing  different  requests  at  any  instant. 
Since  each  backend  can  queue  requests,  once  a  backend  has  completed  one  request 
it  can  check  the  queue  to  see  if  further  requests  have  been  queued  for  processing. 

These  design  requirements  have  resulted  in  an  efficient,  highly-utilized, 
portable  system  which  may  overcome  the  performance-gain  and  capacity-growth 
problems. 

B.    THE  PROCESS  STRUCTURE 

The  software  of  MBDS  is  organized  into  a  process  structure  as  seen  in  Figure 
1.2,  The  host  computer  has  one  process,  the  test  interface  (TI).  TI  provides 
the  user  with  three  levels  of  menus.  Level  1  is  used  to  provide  system  commands. 
Level  2  provides  database  initialization  commands  such  as  generating  a  database, 
loading  a  database,  executing  the  request  interface,  and  exiting  from  the  system. 
The  third  menu  is  invoked  when  the  option  for  executing  the  request  interface  has 
been  chosen.  This  menu  allows  the  user  to  select  a  particular  file  of  requests  to  be 
executed,  modify  an  existing  file  of  requests,  store  the  output  received  from 
requests  to  a  particular  file,  and  save  the  results  of  execution-related  tasks.  The 
test  interface  also  provides  for  the  format  of  output  upon  completion  of  a  request. 

1.     The  Processes  of  the  Controller 

The  controller  has  five  processes.      The  Request  Preparation  (REQP) 
process  receives  requests  from  the  test  interface  and  formats  the  request.    If  the 
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request  is  syntactically  correct.  REQP  sends  the  newly  formated  request  to  the 
backends.     The    Insert    Information    Generation    (IIG)    process    is    used    to 


The  Controller 


Each  Backend 

Figure  1.2  The  MBDS  Process  Structure 
provide  information  to  the  backends  for  insert  requests.  Inserts  must  be  handled 
specially  to  insure  that  the  data  is  distributed  evenly  across  the  backends.  The 
Post  Processing  (PP)  process  is  used  to  terminate  a  request  and  provide  any 
results  which  may  have  been  received  from  the  backends  to  the  user.  Get-NET 
(GNET)  and  Put-NET   (PN^T)  are  the  processes  responsible  for  sending  and 
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receiving  messages  on'  Ethernet.     These  processes  determine  the   receiver  of  the 
message  and  the.  route  of  the  message  to  the  receiving  process. 
2.     The  Processes  of  the  Backends 

In  addition  to  the  two  communications  processes,  GXET  and  PNET, 
each  backend  has  three  processes.  The  Directory  Management  (DM)  process 
is  responsible  for  identifying  the  secondary  storage  addresses  necessary  to  access 
records.  To  perform  this  task,  three  tables  are  maintained  -  the  Attribute 
Table  (AT),  the  Descriptor-to-Descriptor-Id  Table  (DDIT).  and  the 
Cluster-Definition  Table  (CDT).  AT  maps  directory  attributes  to  their 
descriptors:  DDIT  assigns  a  unique  id  to  each  descriptor:  and  CDT  maps 
descriptor  ids  to  cluster  ids.  By  referencing  these  tables,  DM  is  able  to  determine 
disk  addresses  of  those  records  in  which  results  of  a  particular  request  may  be 
found.  This  method,  referred  to  as  clustering,  provides  for  an  efficient  technique 
to  access  only  the  relevant  portion  of  the  secondary  storage.  The  Concurrency 
Control  (CC)  process  is  used  to  guarantee  consistency  of  directory  data  and  user 
data.  MBDS  allows  descriptors,  clusters,  and  secondary  storage  addresses  to  be 
changed  dynamically.  Thus.  CC  must  restrict  and  control  their  access  during  such 
changes.  The  Record  Processing  (RECP)  process  receives  the  request  and  disk 
addresses  from  DM,  accesses  the  secondary  storage  for  the  addressed  records,  and 
executes  the  request  against  the  records.  It  is  in  this  process  that  the  retrieve- 
common  operation  is  implemented. 
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C.    THE  ORGANIZATION  OF  THE  THESIS 

The  remainder  of  the  thesis  is  organized  as  follows.  In  Chapter  II  we  present 
the  four  original  primary  operations  and  provide  the  syntax  and  an  example  of 
the  new  primary  operation,  retrieve-common.  In  Chapter  III  we  present  a 
detailed  examination  of  the  processing  logic  of  the  new  operation.  This  is 
followed  by  necessary  changes  to  the  original  specifications  as  they  are  required  in 
the  implementation.  Chapter  IV  provides  the  data  structures  which  have  been 
added  or  modified  in  the  course  of  the  implementation  of  the  operation.  A 
walk-through  the  user  interface  is  presented  in  Chapter  V  to  provide  users  with 
the  details  of  how  to  execute  a  retrieve-common  request.  In  Chapter  VI  the  test 
procedures  used  to  verify  the  program's  correctness  is  discussed.  Finally. 
Chapter  VII  concludes  this  thesis.  This  thesis  is  intended  to  present  the  details 
of  the  implementation  of  the  retrieve-common  and  provide  the  guidance  necessary 
to  properly  use  the  new  operation. 
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II.  THE  PRIMARY  OPERATIONS  OF  MBDS 

In  this  chapter  we  present  the  primary  database  operations  of  MBDS.  We 
first  present  the  attribute-based  data  model  of  MBDS  to  orient  the  reader  to  the 
terminology  used  with  the  operations.  Then,  we  discuss  the  four  original 
operations  and  provide  examples  of  each.  Finally  we  present  the  syntax  and  an 
example  of  the  retrieve-common  operation.  This  chapter  concludes  with  an 
overview  of  the  processing  for  the  retrieve-common  operation. 

A.    THE  ATTRIBUTE-BASED  DATA  MODEL 

MBDS  uses  an  attribute-based  model  in  the  design  of  the  database  [Ref.  1  : 
pp  9-14].  This  model  calls  for  records  to  be  stored  as  a  set  of  attribute-value 
pairs.  The  attribute  s  used  to  describe  a  class  or  certain  characteristic  of  which 
the  values  are  a  part.  The  second  component  is  a  value  for  the  attribute.  This 
value  can  be  a  string  or  an  integer,  but  all  values  for  a  given  attribute  must  be 
consistent.    Attribute-value  pairs  are  enclosed  in  brackets  such  as: 

<STATE.  Vt> 
A   record   is   a   grouping  of  attribute-value  pairs  and   a  record   body  where  no 
attribute  appears  more  than  once  in  the  record.  The  record  body  consists  of  a 
string  of  textual  data.       We  use  the  brackets.  {,  },  to  enclose  the  record  body.     A 
file   is  used  to  name  a  collection  of  records  which  are  grouped  under  common 
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characteristics    and    formats,    say.    having    a    similar   set    of   attributes.     Finally. 
several  files  of  related  data  are  together  referred  to  as  a  database. 

B.  THE  DIRECTORY  STRUCTURES 

MBDS  uses  director}'  data  to  manage  the  database.  Three  constructs  are 
used  for  this  function  -  attributes,  descriptors,  and  clusters.  As  described  above, 
attributes  describe  the  category  or  characteristic  of  the  user  data.  Descriptors 
provide  ranges  into  which  the  attribute  values  may  be  partitioned.  For  example, 
the  attribute  "U.S.  States"  could  use  a  set  of  three  descriptors  [Alabama  - 
Hawaii],  [Idaho  -  Texas],  and  [Utah  -  Wyoming].  Observe  that  any  set  of 
descriptors  must  be  mutually  exclusive.  From  the  user  defined  descriptors, 
clusters  are  formed.  A  cluster  is  a  group  of  records  such  that  every  record  in  the 
cluster  satisfies  the  same  descriptor.  For  instance,  using  the  above  set  of 
descriptors,  records  containing  the  U.S.  States  such  as  Alaska.  Connecticut,  and 
Delaware  must  be  in  the  same  cluster  because  their  U.S.  State  attribute  values 
alphabetically  fall  between  Alabama  and  Hawaii.  Clustering  is  a  very  important 
principle  for  MBDS  because  it  allows  indexing  and  thus  only  those  clusters  which 
may  have  data  which  satisfies  a  request  will  be  accessed  during  processing. 

C.  THE  DATA  MANIPULATION  OPERATIONS 

The  original  MBDS  recognized  four  primary  operations  to  support  the 
database.    These  four  are  INSERT,  DELETE.  UPDATE,  and  RETRIEVE. 
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The  INSERT   operation  is   used   to   place   a   record   into   the  database.     The 
syntax  for  this  request  is: 

INSERT  Record 
where  Record  is  a  set  of  attribute-value  pairs  and  record  body.  The  following 
example  inserts  a  record  into  the  US  file: 

INSERT  (<FILE.  US>,  <STATE,  Vt>.  <TOWN.  Morgan>.  {...}) 
The  DELETE  operation  is  used  to  remove  a  record  or  set  of  records  from  the 
database.    The  syntax  for  this  request  is: 

DELETE  Query 
The  query  (in  parenthesis)  is  used  to  select  the  records  which  the  operation  will  be 
performed  on.    For  instance,  the  following  example  deletes  all  Vermont  records 
from  the  US  file: 

DELETE  ((FILE  =  US)  and  (STATE  =  Vt)) 
The  L^PDATE  operation  is  used  to  modify  records  which  are  already  in  the 
database.    The  syntax  for  the  UPDATE  request  is: 

UPDATE  Query  [Modifier] 
The  query  is  used  to  select  the  records  which  will  be  updated  and  the  modifier  (in 
brackets)  specifies  the  type  of  change  which  will  be  performed  on  those  records. 
The  following  example  will  change  the  population  of  Morgan.  Vt  to  500: 
UPDATE  ((FILE  =  US)  and  (STATE  =  Vt)  and  (TOWN  =  Morgan))  <POP  =  500> 
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The  RETRIEVE  operation   is  used  to   locate   and  return   to  the  user  those 
records  which  satisfy  the  query.    The  syntax  of  a  retrieve  request  is: 

RETRIEVE  Query  (target-list)  [BY  Attribute] [WITH  Pointer] 
The  query  specifies  which  records  are  to  be  selected.  The  target-list  provides  the 
attribute  values  of  that  record  which  are  to  be  returned  to  the  user.  The  BY 
clause,  as  an  option,  is  used  with  the  aggregate  operations.  AVG.  COUNT.  SUM, 
MIN.  and  MAX.  When  this  clause  is  used,  the  records  selected  are  grouped  to 
perform  the  aggregate  operation.  The  WITH  clause,  as  an  option,  is  used  to 
return  pointers  to  the  retrieved  records  for  later  use  with  an  UPDATE  request. 
An  example  of  a  retrieve  which  will  return  all  states  in  the  United  States  with  a 
town  named  Morgan  is: 

RETRIEVE  ((FILE  =  US)  and  (TOWN  =  Morgan))  (STATE) 
In  this  example  the  query  is  (FILE  =  US)  and  (TOWN  =  Morgan).    The  target- 
list  is  STATE.    The  BY  and  WITH  clauses  are  not  used. 

D.    THE  RETRIEVE-COMMON  OPERATION 

1.     The  Syntax  of  the  Retrieve-Common  Operation 

The  retrieve-common  operation  is  used  to  merge  the  records  of  two  files 
which  share  a  common  value  for  specified  attributes.  The  syntax  for  the  retrieve- 
common   resembles  the  syntax  of  the  retrieve  request.     This   allows  the  actual 
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■selection   of  records   from   secondary   storage   to   proceed  exactly   as   two   retrieve 
requests.    The  syntax  for  the  retrieve-common  is: 


RETRIEVE  Query- 1  (target-list-l)[BY  Attribute] [WITH  Pointer] 

COMMON  ( Attribute- 1.  Attribute-2) 

RETRIEVE  Query-2  (target-list-2)[BY  Attribute]  [WITH  Pointer] 


The  first  retrieve  is  referred  to  as  the  source  retrieve  and  the  second  retrieve  is 
called  the  target  retrieve.  Attribute- 1  references  those  records  which  satisfy 
the  source  retrieve  and  Attribute-2  is  used  with  the  target  retrieve.  These 
attributes  are  used  to  select  the  values  which  must  be  identical  in  order  to 
connect  a  source  record  to  a  target  record.  These  two  attributes  need  not  be  the 
same  for  the  merge  to  occur,  only  their  values  do.     As  an  example: 


RETRIEVE  (FILE  =  US)  (STATE.  TOWN) 

COMMON  (TOWN,  CITY) 

RETRIEVE  (FILE  =  Canada) (PROW  CITY) 


could  return: 

(<STATE.  VtxTOWN.  MorganxPROW  Quebec xCITY.  Morgan>) 
or    more    generally,   for    the    United    States    and    Canada    find    all   states,    towns, 
provinces,  and  cities,  where  the  town  and  city  are  identical. 
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2.      An  Overview  of  the  Processing 

The  logical  operation  of  the  retrieve-common  is  as  follows: 

a.  The  retrieve-common  request  is  modified  into  two  retrieve  requests  by 
placing  the  common  attributes  into  the  target  list  of  the  source  retrieve 
and  the  target  retrieve,  respectively. 

b.  All  of  the  records  which  satisfy  the  source  retrieve  are  gathered,  the 
common  attribute  value  is  hashed,  the  records  are  placed  in  the  virtual 
memory,  and  the  hashed  addresses  are  stored  in  the  hash  tables. 

c.  All  of  the  records  which  satisfy  the  target  retrieve  are  collected,  and  the 
hash  values  are  calculated.  These  records  are  also  placed  in  the  virtual 
memory  and  their  addresses  are  stored  into  another  hash  table. 

d.  The  target  records  of  the  backend  are  transmitted  to  all  of  the  other 
backends  to  be  hashed  with  the  local  target  records.  In  this  way.  each 
backend  has  only  local  source  records,  but  has  every  target  record  which  is 
in  the  database. 

e.  To  perform  the  pairwise  merge,  the  backend  checks  if  the  first  value  in 
each  source  record  is  the  same  as  the  first  value  in  each  target  record,  since 
the  common  attributes  have  been  placed  in  the  front  of  the  target  list  so 
that  their  values  would  be  the  first  value  listed  for  all  records.  If  the  two 
values  are  the  same,  the  records  are  concatenated  and  outputted. 
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III.   MODIFICATIONS  REQUIRED  TO  SPECIFICATIONS 

The  original  specifications  for  the  retrieve-common  operation  are  presented  in 
[Ref  2].  When  reviewing  the  specifications  for  the  initial  stages  of  the 
implementation  we  have  found  a  number  of  basic  errors  and  oversights.  These 
basic  errors  have  been  quickly  and  easily  corrected.  Throughout  the  course  of  the 
-mplementation  and  integration  we  have  also  encountered  some  major  logical  and 
program  flow  errors.  These  major  errors  resulted  in  a  partial  redesign  of  the 
program  flow  regarding  the  execution  of  the  retrieve-common  request.  In  the  rest 
of  this  chapter,  we  first  present  the  execution  steps  for  the  retrieve-common 
operation.    Then  we  detail  all  of  the  modifications  to  the  original  specifications. 

A.    THE  EXECUTION  OF  THE  RETRIEVE-COMMON  REQUEST 

This  section  describes  the  sequence  of  actions  for  executing  a  retrieve-common 
request.  Figure  3.1  lists  all  of  the  types  of  messages  used  to  control  MBDS. 
Figure  3.2  displays  the  controller  and  backend  processes  and  the  messages  which 
are  passed  within  the  system  for  a  retrieve-common  operation.  The  order  in 
which  the  messages  are  passed  is  denoted  alphabetically  (e.g.,  "a"  is  first).  The 
digit  following  the  letter  is  the  type  of  message  as  listed  in  figure  3.1. 

A  retrieve-common  request  originates  in  the  test-interface  (TI)  process.  This 
process  allows  the  user  to  choose  whether  a  single  request  is  being  generated  or 
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MESSAGE-TYPE  NUMBER  AND  NAME 


1  Traffic  Unit 

2  Request  Results 

3  Number  of  Requests  in  a  Transaction 

4  Aggregate  Operators 

5  Requests  with  Errors 

6  Parsed  Traffic  Unit 

7  New  Descriptor  Id 

8  Backend  Number 

9  Cluster  Id 

10  Request  for  New  Descriptor  Id 

11  Backend  Results  for  a  Request 

12  Backend  Aggregate  Operator  Results 

13  Record  that  has  Changed  Clusters 

14  Results  of  a  Retrieve  Caused  by  Update 

15  Descriptor  Ids 

16  Request  and  Disk  Addresses 

17  Changed  Cluster  Response 

18  Fetch 

19  Old  and  New  Values  of  Attribute  being  Modified 

20  Type-C  Attributes  for  a  Traffic  Unit 

21  Desc-Id  Groups  for  a  Traffic  Unit 

22  Cluster  Ids  for  a  Traffic  Unit 

23  Release  Attribute 

24  Release  all  Attributes  for  an  Insert 

25  Release  Descriptor-Id  Groups 

26  Attribute  Locked 

27  Descriptor-Id  Groups  Locked 

28  Cluster  Ids  Locked 

29  Generated  Inserts  Completed 

30  Request  Id  of  a  Completed  Request 

31  L'pdate  Request  has  Completed 

32  Source  Retrieve-Common  has  Completed 

33  Notification  of  a  Retrieve-Common  Request 

34  Target  Retrieve-Common  Records 

Figure  3.1     The  MBDS  Message  Types 
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whether  a  transaction  of  several  requests  is  being  built.  As  each  request  is 
composed,  the  user  only  inputs  that  information  which  changes  from  one  request 
to  another  such  as  the  query  and  the  target  list.  The  user  is  not  responsible  for 
the  format  of  the  request;  this  is  automatically  generated. 

The  request  is  then  sent  to  the  Request  Preparation  (REQP)  process  for 
parsing,  syntax  checking,  and  formatting  into  a  request  table  (al).  REQP  notifies 
Post  Processing  (PP)  of  the  number  of  requests  in  the  transaction  (b3)  and  the 
aggregate  operator  of  the  request  (c4).  Upon  completion  of  these  actions,  REQP 
notifies  Record  Processing  (RECP)  of  the  new  retrieve-common  request  (d33). 
REQP  then  sends  the  parsed  traffic  unit  to  Directory  Management  (DM)  (e6). 
DM  calls  on  the  Concurrency  Control  (CC)  process  to  lock  the  directory 
attributes  (f20).  After  they  have  been  locked,  CC  notifies  DM  (g26)  and  DM 
begins  descriptor  search  for  both  the  source  and  target  retrieve.  Once  this  is 
completed.  DM  notifies  CC  to  release  the  locks  on  the  attributes  (h23)  and  DM 
broadcasts  the  descriptor  ids  of  the  source  and  target  retrieves  to  the  other 
backends  (ilo).  The  DM  in  the  other  backends  are  also  sending  their  descriptor- 
ids  to  the  DM  in  this  backend  (jlo).  The  backends  use  the  information  received 
from  the  other  backends  to  form  descriptor-id  groups.  These  groups  are  then  sent 
to  CC  to  be  locked  (k2l).  After  CC  notifies  DM  that  these  groups  are  locked. 
(m27)  DM  performs  cluster  search  and  notifies  CC  to  release  the  locks  for  both 
retrieves  (n25).  Next,  DM  sends  the  cluster  ids  for  the  retrieval  to  CC  (o22).    CC 

notifies    DM    when    the    clusters    have    been    locked    (p28).     At    this    time.    DM 
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determines  the  disk  addresses  for  this  request.  DM  then  sends  the  source  retrieve 
request  and  its  disk  addresses  to  RECP  (ql6|.  When  RECP  finishes  executing 
the  source  retrieve  and  has  stored  the  records  in  the  virtual  memory,  it  notifies 
DM  that  the  source  retrieve  has  completed  processing  and  the  target  retrieve  and 
its  disk  addresses  can  be  sent(r32).  RECP  performs  the  necessary  processing  for 
the  target  retrieve  and  stores  the  records  which  satisfy  the  request  into  the  virtual 
memory.  The  last  record  stored  for  this  request  carries  a  completion  flag  which 
indicates  that  the  last  target  record  has  been  stored.  This  flag  signals  the  backend 
to  broadcast  the  target  records  to  the  other  backends  (s34).  The  backends  must 
closely  monitor  the  progress  of  the  two  requests  because  target  records  from  other 
backends  may  be  received  (t34)  before  the  source  retrieve  has  been  processed  at 
this  backend.  Two  flags  are  used  to  control  this  -  a  source  completed  flag,  and  a 
target  count  flag  which  counts  the  number  of  backends  having  finished  sending 
target  records.  The  pairwise  merge  of  records  can  not  be  started  until  both  the 
source  retrieve  rias  completed  and  all  of  the  records  from  the  other  backends  have 
been  received.  Once  the  merging  of  records  has  been  completed,  the  results  are 
sent  to  PP  (u2)  which  removes  the  request  from  the  active  request  table  and 
sends  the  records  to  TI  (v2)  for  formatting  the  output  and  sending  the  results  to 
the  host. 
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B.    THE  DEVIATIONS  FROM  THE  SPECIFICATIONS 

Since  the  time  that  the  specifications  have  been  designed  [Tun85].  several 
versions  of  MBDS  have  been  implemented.  These  versions  have  been  combined, 
resulting  in  a  system  which  uses  multi-templates,  multi-computers,  and  the 
retrieve-common  operation.  Differences  in  the  specifications  are  not  discussed. 
Interested  readers  should  consult  [Ref  3]  and  [Ref  4]  for  more  information. 
Presented  in  the  following  paragraphs  are  errors  and  omissions  of  the 
specifications  listed  by  process.  Two  procedures.  Insert  Information  Generation 
(IIG)  and  PP  have  not  required  any  modification. 

1.     The  Test  Interface  Process 

The  TI  process  has  been  omitted  from  the  retrieve-common  specifications. 
This  process  performs  the  vital  functions  of  receiving  the  input  from  the  host 
computer"  and  displaying  the  results.  A  new  procedure.  Tl-retrieve-common,  has 
been  implemented  to  allow  the  user  to  build  a  new  retrieve-common  request. 
This  procedure  makes  use  of  the  existing  "build-retrieve"  procedure  by  calling  it 
twice,  once  to  build  the  source  retrieve  and  once  to  build  the  target  retrieve. 
Placed  between  these  two  calls  is  a  procedure  which  builds  the  common  attributes 
and  places  them  in  the  proper  format.  The  procedure  TI-ReqRes-Output  has 
been  modified  to  provide  a  format  compatible  with  the  attribute-based  model. 
that  is,  records  formed  as  attribute-value  pairs. 
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2.      The  Request  Preparation  Process 

The  specifications  of  the  REQP  process  accurately  described  the  method 
to  properly  parse  the  retrieve-common  request.  One  deviation  from  the  pseudo- 
code which  the  implementation  used  has  been  for  style  more  than  accuracy.  The 
specifications  called  for  five  flags  to  be  used  to  recognize  which  attribute  in  the 
retrieve-common  is  being  parsed.  For  instance,  if  flagl  is  true,  the  common 
attribute-1  is  being  parsed,  if  flag2  is  true,  then  common-attribute-2  is  being 
parsed,  etc.  These  flags  resulted  in  a  long  if-then-else  ladder.  Rather  than  using 
these  flags,  a  single  variable  is  used  which  ranges  in  value  from  zero  to  five.  A 
case  statement  is  used  to  determine  which  attribute  is  being  parsed  based  on  the 
value  of  the  variable. 

Another  modification,  which  has  not  been  anticipated  in  the 
specifications,  is  the  need  to  notify  RECP  of  a  retrieve-common  request  prior  to 
notifying  DM.  During  testing,  it  has  become  apparent  that  some  backends 
broadcast  target  records  before  the  other  backends  are  aware  that  the  request 
exists.  This  results  in  the  backend  not  having  a  buffer  available  to  store  the  in- 
coming records.  The  key  to  solving  this  problem  is  to  have  REQP  notify  RECP 
of  the  request  number  of  the  retrieve-common.  Then,  RECP  immediately  creates 
buffer  space  for  both  the  source-retrieve  and  the  target-retrieve. 
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3.  The  Concurrency  Control  Process 

There  has  been  a  need  to  add  the  retrieve-common  type  to  case 
statements  in  some  of  the  procedures,  but  the  logical  flow  remained  the  same  as 
for  the  retrieve  request. 

4.  The  Parallel  Communications  Link  Process 

The  modifications  which  were  required  to  the  PCL  processes  included 
adding  the  three  new  message  types  (SourceFinished.  Bucketlnformation.  and 
RP-RetComNotification)  with  the  routing  instructions  for  each  message  and 
specifying  which  process  is  to  receive  messages  of  each  type. 

5.  The  Directory  Management  Process 

The  original  specifications  called  for  DM  to  hold  the  target-retrieve  from 
all  processing  until  the  source-retrieve  had  been  completed.  Rather  than  hold  the 
target-retrieve  processing  at  this  point,  both  retrieves  perform  attribute  search, 
descriptor  search,  cluster  search,  and  address  generation  concurrently.  After 
address  generation,  while  the  source  retrieve  is  sent  to  RECP.  the  target  retrieve 
is  held  at  DM.  This  requires  a  buffer  to  store  the  disk  addresses  for  the  target 
retrieve.  When  notification  arrives  that  the  source-retrieve  has  finished,  the 
request  and  disk  addresses  are  sent  to  RECP  for  processing. 

6.  The  Record  Processing  Process 

The  majority  of  the  retrieve-common  implementation  occurs  in  this 
process.     Almost    1200    lines    of    source    code    have    been    added.     Due    to    the 
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complexity  of  the  message  passing  and  processing  logic,  several  differences,  exist 
between  the  specifications  and  the  implementation. 

As  discussed  in  the  REQP  section  above,  a  message  is  sent  from  REQP  to 
RECP  announcing  a  new  retrieve-common  request  is  being  processed.  This 
message  activates  a  routine  which  allocates  and  partially  initializes  a  RP-rid-info 
structure.  This  structure  is  used  by  RECP  to  store  all  of  the  information  known 
about  the  request.  This  initialization  includes  allocating  hash  tables  in  the  event 
that  target  records  arrive  from  other  backends  which  may  be  processing  this 
request  quicker.  The  structure  can  only  be  partially  initialized  at  this  time 
because  DM  is  using  the  request  table  and  generating  the  disk  addresses  which 
will  be  needed  to  complete  this  structure.  When  the  disk  addresses  have  been 
generated.  DM  sends  this  information  to  RECPROC  and  the  rest  of  the 
initialization  takes  place. 

The  specifications  call  for  a  global  table  to  be  available  which  provides 
the  disk  addresses  of  the  hash  tables.  Rather  than  using  a  global  structure,  a 
hashing-information  structure  is  used  which  is  attached  to  the  RP-rid-info 
structure.    This  structure  is  discussed  in  detail  in  Chapter  IV. 

The    hashing    procedures    have    been    implemented    as    written    in    the 

specifications:   however,  rather  than  three  lengthy  procedures  for  small  integers. 

large  integers,  and  strings,  one  procedure  is  used.    The  only  difference  in  the  way 

these  three  types  of  values  are  processed  is  the  manner  in  which  the  hash  value 

(bucket)  is  calculated.    For  this  part  of  the  logic  a  case  statement  is  used. 
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The  final  change  to  the  specifications  has  been  the  method  of  managing 
auxiliary  memory.  The  specifications  require  a  deliberate  step  to  be  made  by 
placing  the  hash  tables  into  known  disk  addresses.  Rather  than  performing  this 
task,  the  implementation  uses  the  logical  addresses  for  the  hash  tables  and  lets  the 
operating  system  page  the  records  into  and  out  of  the  secondary  storage.  This 
greatly  reduces  the  complexity  of  the  code. 
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IV.    THE  DESIGN  OF  DATA  STRUCTURES  FOR  THE  IMPLEMENTATION 

The  data  structures  in  the  implementation  of  the  retrieve-common  operation 
are  selected  based  on  three  design  requirements.  First,  the  data  structures  must 
provide  a  buffer  for  the  selected  source  records  and  target  records  to  be  stored. 
Second,  the  retrieve-common  data  structures  should  be  hidden  from  data 
structures  used  by  other  operations.  This  results  in  reducing  the  possibility  of 
other  operations  inadvertently  using  the  retrieve-common  structures.  The  third 
requirement  is  for  the  retrieve-common  structures  to  use  the  memory  efficiently. 
To  meet  this  third  requirement,  the  data  structures  should  be  dynamically 
allocated  only  when  a  retrieve-common  operation  is  processed.  Other  operations 
should  not  be  allocated  any  of  the  retrieve-common  structures. 

To  satisfy  these  requirements,  only  the  record  processing  data  structures  are 
modified.  The  following  sections  describe  the  modifications  to  the  existing 
structures  and  the  creation  of  new  structures  from  a  top-down  viewpoint. 

A.    THE  RP  RIDJNFO  MODIFICATION 

The  RP  rid  info  (Record  Processing  request  identification  information) 
structure  is  a  high-level  structure  which  is  allocated  each  time  RECP  is  notified  of 
a  new  request.  This  structure  holds  information  about  the  request  such  as  a 
pointer  to  the  request  number,  a  pointer  to  the  request  table,  a  pointer  to  the 
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result  buffer,  and  a  pointer  to  the  new  request  which  has  been  received,  as  can  be 
seen  in  Figure  4.1. 


struct    RP   rid   info 

{ 
struct    Reqld    RP   ri  rid; 

struct    REQtbl   definition   RP_ri_req; 

struct    rtemp   definition    *RP   ri  tmpl   ptr; 


struct    hashing   info  *RP   ri  hash; 
int        SrceDone; 


struct    ResultBuffer    *RB_pointer;  /*  result  buffer  */ 
struct    RP   rid   info    *next   RP   ridinfo; 

}; 

Figure  4.1    The  RP   Rid   Info  Structure 


This  structure  is  used  to  link  the  retrieve-common  structures  from  the  existing 
data  structures.  The  hash  tables  are  not  placed  directly  into  this  structure,  but 
are  accessed  by  a  pointer  in  this  structure  called  RP  ri  hash.  This  allows  a 
better  use  of  memory  because  non-retrieve-common  requests  set  this  pointer  to 
NULL  and*  no  further  retrieve-common  structures  are  allocated.  Retrieve-common 
requests  use  this  pointer  to  connect  a  hashing-info  structure  which  is  discussed  in 
the  next   section.      The  second  modification   required   is  to  add   a  boolean  flag 
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which  is  set  to  true  when  the  source  retrieve  has  completed.  This  flag  is  used  to 
determir.  if  the  merging  of  files  can  begin.  Recall  that  there  are  two  criteria 
which  must  be  satisfied  before  merging  can  occur,  namely,  all  the  target  records 
from  the  other  backends  must  be  received  and  the  source  and  target  retrieves  for 
this  backend  must  be  completed. 

B.    THE  HASHING  INFO  STRUCTURE 

The  hashing  info  structure,  used  to  hold  the  record  buffer  and  valuable 
control  information,  is  essential  for  the  efficiency  of  the  code.  Before  discussing 
this  structure,  an  overview  of  the  retrieve-common  processing  is  necessary. 

For  both  the  source  and  target  retrieves  of  the  retrieve-common  operation, 
records  are  physically  removed  from  the  disk  using  the  code  developed  for  the 
retrieve  command.  Just  prior  to  the  point  as  in  the  retrieve  request  when  the 
result  records  are  passed  back  to  the  controller  from  the  backends.  the  backend 
realizes  that  the  current  operation  is  the  retrieve-common  request,  not  the 
retrieve.  This  signals  the  backend  not  to  send  the  records  to  the  controller  but  to 
store  the  records  in  the  virtual  memory.  To  do  this,  each  record  is  hashed  on  its 
common  attribute  value,  and  then  stored  in  a  temporary'  buffer.  As  this  buffer 
becomes  full,  the  records  are  transferred  to  the  virtual  memory  and  the  virtual 
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address  is  placed  in  the  hash  tables.    When  the  last  record  for  a  request  has  been 
placed  in  the  virtual  memory,  several  options  may  occur: 


1.  If  this  is  a  target-retrieve,  the  target  records  which  are  accessible  from  the 
hash  tables  are  sent  to  the  other  backends. 

2.  If  all  of  the  target  records  from  the  other  backends  have  been  received  and 
the  source  and'  target  retrieves  at  this  backend  have  been  completed,  the 
pairwise  merge  can  begin. 

3.  If  either  its  local  target  retrieve  has  not  been  completed  or  the  target 
records  of  at  least  one  other  backend  have  not  been  received,  this  backend 
must  wait  before  beginning  to  merge  the  records. 


The  retrieve-common  is  allocated  with  two  RP  rid  info  structures  when  the 
record  processing  is  notified  of  the  request,  one  for  the  source  retrieve  and  one  for 
the  target  retrieve.  Certain  initializations  must  occur  at  this  time  including  the 
allocation-  of  the  hashing  info  structure.  When  the  first  record  to  be  stored  is 
received,  further  initialization  occurs.  The  value  of  the  common  attribute  is  used 
to  determine  the  type  of  hashing  functions  to  be  used  for  this  retrieve.  It  can  be 
one  of  three  types  -  string,  small  integer,  or  large  integer.  To  make  this 
determination,  the  template  information  must  be  examined.  The  template  holds 
the  value  type  of  the  attribute  as  well  as  the  maximum  and  minimum  values  that 
this  attribute  can  assume.  Once  this  information  is  gathered  from  the  template,  it 
is  not  collected  again  because  all  the  records  of  this  request  use  the  same  template 
and  share  the  same  characteristics.    For  this  reason,  the  hashing   info  structure 
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hold>  the  value  type,  minimum  value,  and  the  range  of  values.  A 
new  request  flag  is  used  to  determine  whether  this  initial  examination  of  the 
template  has  occurred.  The  target  counter  is  used  to  count  the  number  of 
backends  which  have  sent  target  records  for  this  request.  This  flag  is  incremented 
each  time  another  backend  informs  this  backend  that  the  last  record  has  been 
sent. 


struct    hashing   info 

{ 
struct  hash   result  hash   buffer; 

int  value  type; 

int  min, 

range. 

new   request   flag, 

target   cnt: 
struct  block  *hash_table[NUMBER_OF_BUCKETS] 
}: 

Figure  4.2    The  Hashing   Info  Structure 


After  the  hash  value  has  been  calculated  for  each  record,  the  record  is  not 
immediately  placed  in  the  hash  table  but  is  placed  in  a  temporary  buffer  called 
the  hashbuffer.  The  process  of  placing  records  in  the  hash  tables  can  be  lengthy 
and  for  this  reason  the  records  are  first  buffered  and  then  as  the  buffer  gets  full, 
several  records  can  be  placed  in  the  hash  tables  at  one  time. 
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The  final  element  in  the. hashing  info  structure  is  the  array  of  hash  tables. 
The  hash  tables  are  pointers  to  the  block  structure  which  stores  the  physical 
records.  These  pointers  are  organized  into  an  array  with  the  index  of  the  array 
being  the  hash  value.  Using  this  convention,  the  records  which  receive  a  hash 
value  223  will  be  found  by  referencing  hash  table[223].  The  composition  of  the 
block  structure  is  discussed  in  section  D. 

C.    THE  HASHRESULT  STRUCTURE 

The  purpose  of  the  hash  result  structure  shown  in  Figure  4.3  is  to  provide  a 
temporary    buffer    for    records    which    are    not    yet    stored    in    the    hash    tables. 


struct    hash  result 

{ 
struct  RP   rid   info  *  Origin   RP   ri  ptr. 

int  length; 

char  hashed_result[HR_SIZE  +  l]; 

}; 

Figure  4.3    The  Hash   Result  Structure 


The  primary  element  of  the  hash   result  is  the  character  array  which  is  used  to 
store  the  hashed  records,  called  the  hashed   result.    The  index  for  this  arrav  is  the 
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variable  length  which  is  also  an  element  of  this  structure.  The  third  element  of 
this  structure  is  the  Origin  RP  ri  ptr.  This  pointer  links  back  to  the  current 
RP  ri  ptr  and  is  used  so  that  the  high-level  information  such  as  the  request 
identification  number  and  the  request  table  can  be  accessed  from  the  lower  levels. 

D.    THE  BLOCK  STRUCTURE 

The  block  structure  shown  in  Figure  4.4  holds  the  records  for  each  hash  value. 
The       element       of      the      structure      which      holds      the      records       is      the 


struct    block 

{ 

int  length: 

struct  block  *next   block: 

char  contents[MAX_BLK_SIZE]; 

}; 

Figure  4.4    The  Block  Structure 


character  element  of  the  structure  which  holds  the  records  is  the  character  array, 
called  contents.  The  index  for  this  array  is  the  variable  length.  The  third 
element  of  this  structure  is  a  pointer  to  the  next  block,  called  next  block.  The 
number  and  size  of  records  which  must  be  stored  under  the  index  of  one  hash 
value  is  virtually  limitless.  Dynamic  allocation  of  blocks  must  occur  when  records 
cause  an  overflow  to  the  last  block.  This  pointer  is  used  to  connect  the  blocks  of 
one  hash  table  into  a  linked  list. 
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V.   THE  TESTING 

The  testing  of  a  software  project  is  the  single  most  important  task  in  the 
software  life  cycle.  It  is  during  this  stage  of  the  software  development  that  the 
requirements  definition,  detailed  design,  and  implementation  are  evaluated  and 
any  errors  which  are  discovered  are  corrected.  The  objective  of  the  testing  phase 
is  to  locate  as  many  errors  as  possible.  It  is  very  difficult  to  determine  when  a 
program  is  complete  and  correct.  There  is  a  well-known  saying  that  the  bug  to 
worry  about,  is  the  one  which  has  not  been  found.  This  is  certainly  true  for 
projects  as  large  as  MBDS. 

In  the  first  section  of  this  Ghapter  we  present  an  overview  of  the  types  of  tests 
which  we  have  determined  are  •  appropriate  for  the  retrieve-common 
implementation.  The  final  section  of  this  chapter  discusses  the  unit  tests  which 
have  been  conducted. 

A.    THE  TESTING  PROCESS 

Several  techniques  have  been  used  to  test  the  retrieve-common 
implementation.  These  techniques  include  unit  testing,  white-box  testing, 
and  black-box  testing.  During  a  unit  test,  segments  of  code  are  tested  in  an 
isolated  environment  in  an  attempt  to  determine  the  code's  correctness.  This 
allows  the  location  of  errors  to  be  discovered  easier  than  if  the  complete  code  is 
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being  tested.  During  white-box  and  black-box  testings  several  retrieve-common 
requests  are  used  to  test  an  implementation.  A  white-box  test  is  used  to 
demonstrate  that  every  line  of  code  properly  performs  the  desired  operation. 
This  implies  that  each  path  of  control  structures  (e.g..  loops  and  if-statements)  is 
tested.  A  black-box  test  is  used  to  demonstrate  that  the  program  is  correct  even 
for  several  different  types  of  input.  This  includes  boundary  cases,  that  is. 
positive,  negative,  and  zero  input  values,  and  inputs  which  are  close  to  satisfying 
a  request  but  off  enough  to  make  them  wrong. 

Unit  testing  in  the  strictest  sense,  as  single  modules,  is  not  possible  for  the 
retrieve-common  operation  because  very  few  of  the  modules  can  stand  alone. 
However,  we  have  been  able  to  divide  the  program  into  three  sections  and  test 
each  of  these  sections  individually.  The  first  section  has  been  tested  for  the 
modifications  to  the  REQP. process.  The  second  section  has  been  verified  for  the 
correctness  of  the  hashing  algorithm  in  a  single  backend.  The  final  section  has 
been  tested  for  the  broadcasting  and  receiving  of  target  records  in  a  multiple 
backend  environment.  These  tests  are  discussed  in  the  next  section  of  this 
chapter. 

Although  it  is  extremely  difficult  to  test  all  combinations  of  the  control 
structures  in  the  retrieve-common  implementation,  the  white-box  test  did  test 
each  processing  sequence.  For  the  retrieve-common  operation  there  are  several 
special  cases  which  have  been  considered.      These  included  testing  for  common 
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attributes  which  are  of  value  types  string,  small  integer,  and  large  integer,  as  well 
as  intermediate  and  final  results  which  overflow  the  buffer  spaces. 

The  black-box  test  was  used  to  test  unusual  inputs.  For  this  test  the 
retrieve-common  implementation  has  been  tested  for  requests  which  have  no 
records  to  be  returned,  requests  which  return  a  large  number  of  records,  and 
requests  with  nearly  identical  common  attributes  values. 

B.    THE  DETAILS  OF  THE  TESTING  PROCESS 

In  this  section,  we  discuss  the  purpose  of  each  phase  of  the  unit  testing.  This 
is  followed  by  the  method  of  measuring  the  results.  The  discussion  of  each  phase 
is  concluded  with  the  significant  errors  which  have  been  discovered  during  the 
phase  of  testing  and  the  corrective  action  which  has  been  required. 

The  first  phase  tested  the  modifications  to  the  REQP  modules.  To  conduct 
this  test,  the  necessary  modifications  to  the  code  have  been  made  and  a  retrieve- 
common  request  is  inputted.  Recall  that  the  retrieve-common  operation  is 
processed  the  same  as  two  retrieve  requests.  Since  the  hashing  function  has  not 
been  implemented  at  the  time  of  the  test,  the  results  from  the  processing  of  the 
source  and  target  retrieves  are  sent  to  the  host.  This  test  is  considered  complete 
when  these  results  are  received  at  the  host  computer.  Two  significant  errors  have 
been  discovered  during  this  test.  One  error  had  been  the  result  of  using  an 
outdated  version  of  the  lexical  source  code.  An  updated  version  had  been 
implemented   just    prior   to    the   start    of  the   retrieve-common    implementation. 
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Once  this  error  had  been  corrected,  the  execution  had  been  traced  from  TI  to  the 
source  code  of  the  compiler.  The  second  error  had  been  in  the  format  of  the 
request  when  sent  from  REQP  to  DM.  Under  close  examination  of  the  format. 
we  observed  that  the  number  identifying  the  request  had  been  initialized  to  zero 
rather  than  one.  This  error  had  been  significant,  but  easy  to  correct.  Once  these 
errors  had  been  corrected,  the  results  for  the  source  and  target  retrieves  had  been 
received  at  the  host  computer. 

The  purpose  of  the  second  phase  of  the  unit  testing  had  been  to  test  the 
retrieve-common  implementation  in  a  single-backend  environment.  This  test 
demonstrates  the  correctness  of  the  modifications  to  the  DM  process,  and  the 
implementation  of  the  hashing  algorithm,  the  use  of  the  virtual  storage,  and  the 
output  format.  Errors  which  had  been  found  in  this  section  of  code  were  minor 
and  easily  detected  and  corrected.  These  errors  had  included  syntax  errors  and 
infinite  loops  caused  by  searching  for  an  incorrect  delimiter. 

The  third  phase  of  the  retrieve-common  unit  testing  had  been  designed  to  test 

the   message   sending   and   message   receiving   functions.      To  perform   this   test, 

several  backends  had  been  loaded  with  the  database  records  and  the  transfer  of 

target  records  between  backends  had  been  examined.      It  is  during  this  phase  of 

testing  that  we  had  become  aware  that  the  new  message  type.  Retrieve-Common 

Notification,  is  required  as  discussed  in  Chapter  IV.      It  is  also  in  this  stage  that 

we  had  experienced  messages  being  lost  on  PCL  for  a  large  number  of  backends. 

This  problem  had  been  isolated  to  the  Get  NET  process.     The  messages  had  been 
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properly  placed  on  the  Ethernet  but  had  not  always  been  received  by  the  Get 
NET  process.  This  problem  has  been  solved  by  raising  the  priority  of  the  Get 
NET  process  in  order  to  guarantee  that  this  process  is  the  first  process  to  check 
the  message  queue. 
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VI.    CONCLUSION 

In  this  thesis  we  have  presented  the  implementation  of  the  retrieve-common 
operation  in  the  multi-backend  database  system.  The  original  specifications 
have  been  closely  followed  with  the  exceptions  as  noted  in  Chapter  III. 

The  MBDS  architecture  consists  of  a  controller  computer  and  one  or  more 
backend  computers.  Communication  between  these  computers  is  on  an 
Ethernet.  The  controller  manages  the  backends  by  assigning  them  database 
requests  to  process.  The  backends  perform  the  operations  required  and  send  the 
results  to  the  controller  as  appropriate.  The  software  is  organized  in  a  process 
structure  with  five  controller  processes,  five  backend  processes,  and  a  test  interface 
process  for  the  host  computer. 

MBDS  uses  an  attribute-based  data  model.  This  model  represents  database 
records  as  a  group  of  attribute- value  pairs  and  a  record  body.  The  primary 
operations.  INSERT.  DELETE.  RETRIEVE.  UPDATE,  and  now.  RETRIEVE- 
COMMON,  manipulate  the  data  by  accessing  the  records.  Not  every  record  is 
accessed  for  a  request  because  the  records  have  been  partitioned,  or  clustered,  by 
the  values  of  the  director!'  attributes.  This  allows  MBDS  to  access  only  those 
disk  addresses  which  may  have  relevant  records  in  them. 
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The  retrieve-common  request  is  used  to  merge  the  records  of  two  files  which 
share  a  common  attribute  value  for  specified  attributes.  The  syntax  of  the 
retrieve-common  request  resembles  the  syntax  of  the  retrieve  request.  This 
allows  the  processing  logic  of  the  retrieve-common  operation  to  use  many  of  the 
retrieve  procedures.  The  format  of  a  retrieve-common  request  is  a  source  retrieve 
request  followed  by  a  common  attribute  for  the  source  request,  a  common 
attribute  for  the  target  retrieve  request,  and  a  target  retrieve  request.  Using  this 
format,  a  retrieve-common  request  can  be  transformed  into  two  retrieve  requests 
by  placing  the  respective  common  attributes  into  the  source  and  target  retrieves. 

The  processing  of  a  retrieve-common  request  differs  from  the  retrieve  request 
after  the  records  have  been  fetched  from  the  auxiliary  memory  device.  Rather 
than  return  the  results  to  the  host,  they  are  hashed  on  the  common  attribute 
value  and- placed  into  the  virtual  storage  with  the  logical  address  placed  in  a  hash 
table.  This  occurs  for  both  the  source  and  target  retrieve.  The  target  retrieve 
results  are  then  sent  to  all  other  backends  so  that  each  backend  has  only  local 
source  retrieve  results  but  have  all  the  target  retrieve  results  in  the  system.  After 
these  additional  records  have  been  stored  in  the  virtual  storage  of  the  backend. 
each  source  retrieve  record  is  compared  with  each  target  retrieve  records.  If  the 
respective  common  attributes  are  identical  the  two  records  are  merged  and  the 
results  are  sent  to  the  host  computer. 

In    summary,    we    have    implemented    an    extremely    useful    and    desirable 

function.      The    retrieve-common    operation    makes    MBDS    more    complete    by 
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providing  a  method  of  merging  two  files.      With  the  completion  of  this  thesis,  the 
final  MBDS  primary  operation  has  been  implemented. 


46 


APPENDIX  A 


THE  TESTING  RESULTS 


In  this  appendix,  we  present  the  test  results  for  the  retrieve-common  operation. 
A.   THE  INSERTED  RECORDS 


TEST 
TEST 
TEST 
TEST 
TEST 
TEST 
TEST 
TEST 
TEST 
TEST 
TEST 
TEST 
TEST 
TEST 
TEST 
TEST 
TEST 
TEST 
TEST 
TEST 


001  [INSERT 
002JINSERT 
003JINSERT 
004[INSERT 
005JINSERT 
006ilNSERT 
007IINSERT 
008[INSERT 
009IINSERT 
OlOflNSERT 
Oil  [INSERT 
012IINSERT 
013IINSERT 
OHflNSERT 
015IINSERT 
016[INSERT 
017JINSERT 
018[INSERT 
019[INSERT 
020[INSERT 


<  TEMP, Part  > 
<TEMP,Part> 
<TEMP,Part> 
<TEMP.Sups> 
<TEMP,Sups> 
<TEMP,Sups> 
<TEMP,Sups> 
<TEMP,Sups> 

<  TEMP, Sups  > 
<TEMP.Sups> 
<TEMP.Sups> 
<TEMP.Ship> 
<TEMP.Ship> 
<TEMP,Ship> 
<TEMP,Ship> 
<TEMP.Ship> 
<TEMP,Ship> 
<TEMP,Ship> 
<TEMP.Ship> 
<TEMP,Ship> 


.<PNO,Pl>,<NAME,Idm>,<CITY.Mont>)] 

,<PNO,P2>,<NAME,Xyz>,<CITY,Sali>)' 

.<PNO,P3>,<NAME,Nut>,<CITY,Colu>)] 

,<SNO,Sl>,<NAME.Nut>)] 

.<SNO,S2>.<NAME,Nut>)j 

,<SNO.S2>,<NAME.Nut>)] 

.<SNO,Sl>.<NAME,Dec>)] 

.<SNO,S3>.<NAME.Nut>)] 

.<SNO.S3>.<NAME.Deo)j 

.<SNO.S4>.<NAME.Nut>)] 

.<SNO.S4>.<NAME.Deo)j 

.<SNO.S1>.<PNO.P2>.<QTY,500>) 

.<SNO.S2>,<PNO.P2>.<QTY.500>) 

.<SNO.S3>.<PNO.P1>.<QTY.500>) 

.<SNO.S4>,<PNO,P2>.<QTY,1000>)] 

.<SNO.S1>.<PNO.P2>.<QTY.1000>); 

.<SNO.S2>.<PNO.P2>.<QTY,1000>)1 

.<SNO.S3>,<PNO.P1>.<QTY.2000>)1 

.<SNO.S4>.<PNO.P2>.<QTY,2000>)1 

,<SNO.Sl>,<PNO,P2>.<QTY,2000>)j 


47 


B.    TEST  NTMBER  1. 

The  first  test  demonstrates  that  a  typical  retrieve  is  properly  processed. 
TEST   021  RETRIEVE(TEMP=Sups)(SNO.NAME) 

SNO  -  S3 

NAME  =  Nut 

SNO  =  S4 

NAME  =  Nut 

SNO  -  SI 

NAME  -  Dec 

SNO  =  S3 
NAME  -  Dec 

SNO  =  S4 
NAME  =  Dec 

SNO  =  SI 

NAME  =  Nut 

SNO  =  S2 
NAME  -  Nut 

SNO  =  S2 
NAME  =  Nut 
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C.     TEST  NUMBER  2. 

The  second  test  shows  the  results  for  another  retrieve. 
TEST  022iRETRIEVE(TEMP=Ship)(PNO.SNO,QTY); 

PNO  =  P2 

SNO  =  S4 
QTY  =  2000 

PNO  =  P2 
SNO  =  SI 
QTY  =  2000 

PNO  =  PI 
SNO  =  S3 
QTY  =  2000 

PNO  =  P2 

SNO  -  SI 
QTY  =  1000 

PNO  -  P2 

SNO  =  S2 
QTY  =  1000 

PNO  =  PI 
SNO  =  S3 
QTY  -  500 

PNO  =  P2 
SNO  =  SI 

QTY  =  500 

PNO  =  P2 

SNO  =  S2 
QTY  =  500 

PNO  =  P2 
SNO  =  S4 
QTY  =  1000 
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D.    TEST  NUMBER  3. 

This  test  demonstrates  the  results  of  a  retrieve-common  request  which 
combines  the  results  from  Test  1  and  Test  2. 


TEST  023!RETRIEVE(TEMP=Sups)(SNO,NAME) 
COMMON(SNO.SNO) 
RETRIEVE(TEMP=Ship)(PNO,SNO,QTY)] 


<  COMMON. F 
<COMMON,F 
<COMMON,F 
<COMMON.F 

<  COMMON. F 

<  COMMON. F 
<COMMON.F 
<COMMON.F 
<COMMON.F 

<  COMMON. F 
<COMMON.F 
<COMMON.F 
<COMMON,F 
<COMMON.F 
<COMMON.F 
<COMMON.F 
<COMMON.F 
<COMMON,F 


le> 
le> 
le> 
le> 
le> 
le> 
le> 
le> 
le> 
le> 
le> 
le> 
le> 
le> 
le> 
le> 
le> 
le> 


<SNO.Sl> 
<SNO.Sl> 
<SNO,Sl> 
<SNO.Sl> 
<SNO.Sl> 
<SNO.S3> 
<SNO.Sl> 
<SNO.S2> 
<SNO.S3> 
<SNO.S3> 
<SNO,S2> 
<SNO.S2> 
<SNO,S3> 
<SNO.S4> 
<SNO,S2> 
<SNO.S4> 
<SNO.S4> 
<SNO,S4> 


<NAME, 
<NAME. 
<NAME, 
<NAME. 
<NAME. 
<NAME 
<NAME. 
<NAME, 
<NAME, 
<NAME 
<NAME 
<NAME. 
<NAME 
<NAME 
<NAME 
<NAME 
<NAME. 
<NAME 


Deo 

Deo 
Nut> 
Nut> 
Deo 
Nut> 
Nut> 
Nut> 
Nut> 
Deo 
Nut> 
Nut> 
Deo 
Nut> 
Nut> 
Nut> 
Deo 
Deo 


<PN0.P2> 
<PN0.P2> 
<PN0.P2> 
<PNO,P2> 
<PN0.P2> 
<PNO.Pl> 
<PNO,P2> 
<PN0.P2> 
<PNO.Pl> 
<PNO.Pl> 
<PNO.P2> 
<PN0,P2> 
<PN0.P1> 
<PNO.P2> 
<PN0,P2> 
<PNO.P2> 
<PN0,P2> 
<PN0,P2> 


<SNO.Sl> 
<SNO.Sl> 
<SN0.S1> 
<SNO.Sl> 
<SNO.Sl> 
<SNO.S3> 
<SN0.S1> 
<SNO.S2> 
<SNO.S3> 
<SN0.S3> 
<SNO,S2> 
<SNO,S2> 
<SNO.S3> 
<SN0.S4> 
<SN0.S2> 
<SNO.S4> 
<SNO.S4> 
<SN0,S4> 


<QTY.2000> 

<QTY.1000> 

<QTY,500> 

<QTY.2000> 

<QTY.500> 

<QTY.2000> 

<QTY.1000> 

<QTY,500> 

<QTY,500> 

<QTY.2000> 

<QTY.1000> 

<QTY.500> 

<QTY.500> 

<QTY.2000> 

<QTY.1000> 

<QTY.1000> 

<QTY.2000> 

<QTY.1000> 
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E.    TEST  NUMBER  4. 

This  test  demonstrates  another  retrieve  common  request. 

TEST   024iRETRIEVE(TEMP=Part)(PNO.NAME) 
COMMON(PNO.PNO) 
RETRIEVE(TEMP=Ship)(SNO,QTY)j 


<COMMON.File>  <PNO.Pl>  <NAME.Idm>  <SNO.S3>  <QTY.2000> 
<COMMON.File>  <PNO.Pl>  <NAME,Idm>  <SNO.S3>  <QTY,500> 
<COMMON,File>  <PNO.P2>  <NAME,Xyz>  <SNO,Sl>  <QTY.500> 
<COMMON.File>  <PNO.P2xNAME,Xyz>  <SNO.S2>  <QTY.500> 
<COMMON.File>  <PNO,P2>  <NAME,Xyz>  <SNO,S4>  <QTY.1000> 
<COMMON.File>  <PNO.P2>  <NAME.Xyz>  <SNO.S4>  <QTY.2000> 
<COMMON.File>  <PNO.P2>  <NAME.Xyz>  <SNO.Sl>  <QTY.2000> 
<COMMON.File>  <PNO.P2>  <NAME.Xyz>  <SNO.Sl>  <QTY.1000> 
<COMMON.File>  <PNO.P2>  <NAME.Xyz>  <SNO.S2>  <QTY.1000> 
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F.    TEST  NUMBER  5.     This  test  demonstrates  a  retrieve-common  request  using 
the  common  value  as  an  integer  type. 

TEST  025[RETRIEVE(TEMP=Ship)(SNO,QTY,PNO) 
COMMON(QTY.QTY) 
RETRIEVE(TEMP=Ship)(SNO,PNO)] 


<COMMON. 
<COMMON. 
<COMMON. 

<  COMMON, 

<  COMMON. 

<  COMMON. 

<  COMMON. 

<  COMMON. 

<  COMMON. 

<  COMMON. 
<COMMON 

<  COMMON 

<  COMMON. 

<  COMMON 

<  COMMON 

<  COMMON 

<  COMMON 

<  COMMON 

<  COMMON 
COMMON 

<  COMMON 

<  COMMON. 

<  COMMON 

<  COMMON. 
'COMMON 
<COMMON 

<  COMMON 


le>  <SNO.S3>  <QTY.500>  <PNO.Pl>  <SNO.S3>  <PNO.Pl> 
le>  <SNO,S3>  <QTY.500>  <PNO.Pl>  <SNO,Sl>  <PNO.P2> 
le>  <SNO,Sl>-<QTY,500>  <PNO.P2>  <SNO.Sl>  <PNO.P2> 
le>  <SNO.Sl>  <QTY.500>  <PNO.P2>  <SNO.S2>  <PNO.P2> 
le>  <SNO.S3>  <QTY.500>  <PNO.Pl>  <SNO,S2>  <PNO.P2> 
le>  <SNO.Sl>  <QTY.1000>  <PNO,P2>  <SNO,Sl>  <PNO.P2> 
le>  <SNO.Sl>  <QTY,500>  <PNO.P2>  <SNO.S3>  <PNO.Pl> 
le>  <SNO,S2>  <QTY.500>  <PNO.P2>  <SNO.Sl>  <PNO.P2> 
le>  <SNO.Sl>  <QTY.1000>  <PNO,P2>  <SNO.S2>  <PNO.P2> 
le>  <SNO.Sl>  <QTY.1000>  <PNO,P2>  <SNO.S4>  <PNO.P2> 
le>  <SNO.S2>  <QTY.1000>  <PNO,P2>  <SNO.Sl>  <PNO,P2> 
le>  <SNO.S2>  <QTY.1000>  <PNO.P2>  <SNO.S2>  <PNO.P2> 
le>  <SNO,S2>  <QTY.500>  <PNO.P2>  <SNO.S2>  <PNO.P2> 
le>  <SNO.S2>  <QTY.500>  <PNO.P2>  <SNO.S3>  <PNO.Pl> 
le>  <SNO.S4>  <QTY.1000>  <PNO,P2>  <SNO.S4>  <PNO,P2> 
le>  <SNO.S4>  <QTY,1000>  <PNO,P2>  <SNO.Sl>  <PNO.P2> 
le>  <SNO.S2>  <QTY.1000>  <PNO.P2>  <SNO.S4>  <PNO.P2> 
le>  <SNO.S4>  <QTY,2000>  <PNO,P2>  <SNO.S4>  <PNO.P2> 
le>  <SNO,S4>  <QTY.1000>  <PNO.P2>  <SNO.S2>  <PNO,P2> 
le>  <SNO?Sl>  <QTY.2000>  <PNO.P2>  <SNO.S3>  <PNO.Pl> 
le>  <SNO.S3>  <QTY.2000>  <PNO.Pl>  <SNO.S4>  <PNO.P2> 
le>  <SNO.S3>  <QTY.2000>  <PNO.Pl>  <SNO.Sl>  <PNO,P2> 
le>  <SNO,S3>  <QTY.2000>  <PNO,Pl>  <SNO.S3>  <PNO,Pl> 
le>  <SNO.S4>  <QTY.2000>  <PNO,P2>  <SNO.Sl>  <PNO.P2> 
le>  <SNO.S4>  <QTY.2000>  <PNO,P2>  <SNO.S3>  <PNO,Pl> 
le>  <SNO,Sl>  <QTY.2000>  <PNO.P2>  <SNO.S4>  <PNO.P2> 
le>  <SNO,Sl>  <QTY.2000>  <PNO.P2>  <SNO.Sl>  <PNO,P2> 
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APPENDIX  B 

A  WALK  THROUGH  THE  USER  INTERFACE 

In  this  appendix  we  present  a  walk  through  the  user  interface.  Two  aspects  of  run- 
ning retrieve-common  requests  are  displayed.  The  first  request  is  executed  from  an  exist- 
ing file  of  traffic  untis.  The  second  request  has  been  built  on-line.  User  inputs  are  in  bold 
letters. 

How  many  backends  are  there?  (1.1:....)>7 


Do  you  want  de-bugging  messages  printed?  (y/n)>y 
What  operation  would  you  like  to  perform? 

(g)    -   generate  database 

(1)    -    load  database 

(e)    -    execute  test  interface 

(x)    -   exit  to  operating  system 

(z)    -   exit  and  Stop  MDBS 


1 

What  operation  would  you  like  to  perform? 

(t)  -    load  the  template  and  descriptor  files 

(r)  - .  mass  load  a  file  of  records 

(x)  -   exit,  return  to  previous  menu 
t 

ENTER  NAME  OF  FILE  CONTAINING  TEMPLATE  INFORMATION: 

tt.f 


ENTER  NAME  OF  FILE  CONTAINING  THE  DESCRIPTORS: 
td.f 
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What  operation  would  you  like  to  perform0 

(t)    -    load  the  template  and  descriptor  files 
(r)    -    mass  load  a  file  of  records 
(x)    -    exit,  return  to  previous  menu 


NOTE  TO  THE  USER!!!!  YOU  MUST  HAVE  LOADED  THE  TEMPLATES 
AND  DESCRIPTORS  FOR  A  DATABASE,  BEFORE  ATTEMPTING  TO  LOAD 
ANY  RECORDS  INTO  THE  DATABASE!!!! 


ENTER  NAME  OF  FILE  CONTAINING  RECORDS  TO  BE  LOADED: 


tr20.f 


TEST 
TEST 
TEST 
TEST 
TEST 
TEST 
TEST 
TEST 
TEST 
TEST 
TEST 
TEST 
TEST 
TEST 
TEST 
TEST 
TEST 
TEST 
TEST 
TEST 


001  (INSERT 

002  INSERT 
003IINSERT 
004 1  INSERT 
005  .INSERT 
006IINSERT 
007IINSERT 
008IINSERT 
009!  INSERT 
OlOilNSERT 
OlllINSERT 
012IINSERT 
013IINSERT 
014IINSERT 
OlsllNSERT 
016IINSERT 
017JINSERT 
018UNSERT 
019IINSERT 
020IINSERT 


<TEMP.Part>,<PNO.Pl> 
<TEMP.Part>,<PNO.P2> 
<TEMP,Part>,<PNO,P3> 
<TEMP.Sups>,<SNO.Sl>. 
<TEMP.Sups>,<SNO.S2>. 
<TEMP.Sups>,<SNO.S2>. 
<TEMP.Sups>,<SNO.Sl>. 
<TEMP.Sups>,<SNO.S3>. 
<TEMP,Sups>xSNO.S3>. 
<TEMP.Sups>.<SNO.S4>. 
<TEMP.Sups>.<SNO.S4>. 
<TEMP,Ship>.<SNO,Sl>. 
<TExMP.Ship>.<SNO.S2>. 
<TEMP,Ship>.<SNO.S3>. 
<TEMP.Ship>,<SNO,S4>. 
<TEMP,Ship>.<SNO,Sl>, 
<TEMP,Ship>.<SNO.S2>, 
^TEMP.Ship>,<SNO,S3>. 
<TEMP.Ship>.<SNO,S4>. 
<TEMP.Ship>.<SNO.Sl>. 


,<NAME.Idm>.<CITY.Mont>)i 

.<NAME.Xyz>.<CITY.Sali>) 

,<NAME.Nut>,<CITY,Colu>)] 

<NAME.Nut>)j 

<NAME.Nut>)| 

<NAME.Nut>)j 

<NAME.Dec>)| 

<NAME,Nut>)] 

<NAME,Dec>)] 

<NAME.Nut>)j 

<NAME.Dec>)i 

<PNO.P2>,<QTY.500>)j 

<PNO.P2>,<QTY,500>)j 

<PNO.P1>,<QTY,500>)] 

<PNO.P2>,<QTY,1000>) 

<PNO,P2>,<QTY.1000>)i 

<PNO.P2>,<QTY,1000>)j 

<PNO.Pl>.<QTY,2000>)i 

<PNO.P2>,<QTY,2000>): 

<PNO.P2>,<QTY.2000>); 
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What  operation  would  you  like  to  perform? 

(t)    -    load  the  template  and  descriptor  files 

(r)    -    mass  load  a  file  of  records 

(x)    -   exit,  return  to  previous  menu 
x 
What  operation  would  you  like  to  perform? 

(g)    -   generate  database 

(1)   -   load  database 

(e)   -   execute  test  interface 

(x)    -   exit  to  operating  system 

(z)   -   exit  and  Stop  MDBS 
e 
Do  you  ALWAYS  want  to  wait  for  responses?  (y/n) 

>  y 

Enter  the  type  of  subsession  you  want 

(r)     REDIRECT  OUTPUT;  select  output  for  answers 
(d)    NEW  DATABASE:  choose  a  new  database 
(n)    NEW  LIST:  create  a  new  list  of  traffic  units 
(m)  MODIFY:  modify  an  existing  list  of  traffic  units 
(s)    SELECT:  select  traffic  units  from  an  existing  list 

(or  give  new  traffic  units)  for  execution 
(o)    OLD  LIST:  execute  all  the  traffic  units  in  an 

existing  list 
(p)    PERFORMANCE  TESTING 
(x)    EJCIT:  return  to  generate. load. execute,  or  exit  menu 

SELECTIONS    s 


Enter  the  name  for  the  traffic  unit  file 

It  may  be  up  to  13  characters  long  including  the  .ext. 

Filenames  may  include  only  one  r#'  character 

as  the  first  character  before  the  version  number. 

File  name>    tRCreq.f 
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List  of  executable  traffic  units 
(0)  RETRIEVE(TEMP=Sups)(SNO.NAME) 
COMMON(SNO.SNO) 
RETRIEVE(TEMP=Ship)(PNO.SNO.QTY) 

(l)[RETRIEVE(TEMP=Sups)(SNO,NAME); 

(2)[RETRIEVE(TEMP=Ship)(PNO,SNO.QTY)] 

(3)jRETRIEVE(TEMP=Part)(PNO.NAME) 
COMMON  (PNO.PNO) 
RETRIEVE(TEMP=Ship)(SNO,QTY)] 

(4)iRETRIEVE(TEMP=Part)(PNO.NAME)] 

(5)!RETRIEVE(TEMP=Ship)(SNO,QTY.PNO) 
COMMON(QTY,QTY) 
RETRIEVE(TEMP=Ship)(SNO,PNO)] 

(6)  RETRIEVE(TEMP=Ship)(SNO.QTY.PNO)] 


/*This  section  shows  how  to  invoke  a  predefined  retrieve-common  request*/ 

Select  Options 

(d)    display  the  traffic  units  in  the  list 
(n)    enter  a  new  traffic  unit  to  be  executed 
(num)      execute  the  traffic  unit  at  inuml 
(x)    exit  from  this  SELECT  subsession 

Option >    0 
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TEST   023  RETRIEYE(TEMP=Sups)(SNO.NAME) 
COMMON(SNO.SNO) 
RETRIEYE(TEMP=Ship)(PNO.SNO.QTY) 


<COMMON.F 
<COMMON,F 
<COMMON.F 
<COMMON.F 
<COMMON.F 
<COMMON,F 
<COMMON,F 
<COMMON,F 

<  COMMON, F 
<COMMON.F 
<COMMON.F 
<COMMON.F 
<COMMON,F 
<COMMON,F 
<COMMON.F 
<COMMON,F 
<COMMON.F 

<  COMMON. F 


le> 
le> 
le> 
le> 
le> 
le> 
le> 
le> 
le> 
le> 
le> 
le> 
le> 
le> 
le> 
le> 
le> 
le> 


<SNO.Sl> 
<SNO.Sl> 
<SNO.Sl> 
<SNO.Sl> 
<SNO.Sl> 
<SNO.S3> 
<SNO.Sl> 
<SNO.S2> 
<SNO.S3> 
<SNO.S3> 
<SNO.S2> 
<SNO.S2> 
<SNO.S3> 
<SNO.S4> 
<SNO.S2> 
<SNO,S4> 
<SNO.S4> 
<SNO.S4> 


<  NAME.  Deo 
<NAME.Dec> 
<NAME,Nut> 
<NAME,Nut> 
<NAME,Dec> 
<NAME,Nut> 
<NAME,Nut> 
<NAME.Nut> 
<NAME,Nut> 
<NAME,Dec> 
<NAME,Nut> 
<NAME,Nut> 

<  NAME.  Deo 
<NAME,Nut> 
<NAME.Nut> 
<NAME,Nut> 
<NAME,Deo 
<NAME,Deo 


<PNO.P2> 
<PNO.P2> 
<PNO,P2> 
<PNO.P2> 
<PNO,P2> 
<PN0,P1> 
<PNO,P2> 
<PNO,P2> 
<PN0,P1> 
<PNO,Pl> 
<PNO,P2> 
<PNO,P2> 
<PN0.P1> 
<PNO,P2> 
<PNO,P2> 
<PNO,P2> 
<PNO,P2> 
<PNO,P2> 


<SN0.S1> 
<SN0,S1> 
<SN0.S1> 
<SNO.Sl> 
<SN0.S1> 
<SNO.S3> 
<SN0.S1> 
<SNO,S2> 
<SNO,S3> 
<SNO.S3> 
<SNO,S2> 
<SNO.S2> 
<SNO.S3> 
<SNO.S4> 
<SNO.S2> 
<SNO.S4> 
<SNO.S4> 
<SNO.S4> 


<QTY.2000> 

<QTY.1000> 

<QTY,500> 

<QTY,2000> 

<QTY,500> 

<QTY,2000> 

<QTYJ000> 

<QTY.500> 

<QTY.500> 

<QTY.2000> 

<QTY.1000> 

<QTY.500> 

<QTY.500> 

<QTY.2000> 

<QTY.1000> 

<QTY,1000> 

<QTY.2000> 

<QTY.1000> 


/*  This  section  describes  the  method  to  build  a  retrieve-common  request*/ 

Select  Options 

(d)    display  the  traffic  units  in  the  list 
(n)    enter  a  new  traffic  unit  to  be  executed 
(num)      execute  the  traffic  unit  at  [numj 
(x)    exit  from  this  SELECT  subsession 

Option  >    n 

Enter  the  character  for  the  desired  Traffic  Unit  type. 


(r)     Request 

(t)     Transaction  (multiple  requests) 

(f)     Finished  entering  traffic  units. 

Letter>    r 
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Enter  the  character  for  the  desired  next  step. 


(i)  INSERT 

(r)  RETRIEVE 

(u)  UPDATE 

(d)  DELETE 

(c)  RETRIEVE  COMMON 

LETTER>  c 


RETRIEVE  COMMON  Request 
First  enter  the  source  retrieve  request 


RETRIEVE  Request 

Enter  responses  as  you  are  prompted.    You  will  be  prompted  first  for 
the  predicates  of  the  query,  then  attributes  for  the  target-list, 
next  for  an  attribute  for  the  optional  BY  clause  and  finally  for 
a  pointerfor  the  optional  WITH  clause. 

When  you  have  finished  entering  predicates  for  the  query,  respond 
to  the  ATTRIBUTE>  prompt  with  a  <return>. 

ATTRIBUTE>    TEMP 
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Enter  the  character  for  the  desired  relational  operator 

(a)  =   EQUAL 

(b)  =  NOT  EQUAL 

(c)  >   GREATER  THAN 

(d)  >=  GREATER  THAN  or  EQUAL 

(e)  <   LESS  THAN 

(f)  <=  LESS  THAN  or  EQUAL 

Letter>    a 
Value>    Sups 


So  far  your  conjunction  is 

(TEMP=Sups). 

Do  you  wish  to  'and1  additional  predicates  to  this  conjunction?  (y/n) 

n 

Do  you  wish  to  append  more  conjunctions  to  the  query?  (y/n) 

n 

Begin  entering  attributes  for  the  Target-List.    When  you  are 

through  entering  attributes  respond  to  the  ATTRIBUTE>  prompt  with  <return>. 

Do  you  wish  to  be  prompted  for  aggregation? 

n 

ATTRIBUTE >    SNO 

ATTRIBUTE>    NAME 

ATTRIBUTE> 

COMMON  ATTRIBUTE  1>    SNO 

COMMON  ATTRIBUTE  2>    SNO 


The  request  being  built  is: 
RETRIEVE(TEMP=Sups)(SNO.NAME)COMMON(SNO.SNO)  4 
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Enter  the  target  retrieve 

RETRIEVE  Request 

Enter  responses  as  you  are  prompted.    You  will  be  prompted  first  for 
the  predicates  of  the  query,  then  attributes  for  the  target-list, 
next  for  an  attribute  for  the  optional  BY  clause  and  finally  for 
a  pointerfor  the  optional  WITH  clause. 

When  you  have  finished  entering  predicates  for  the  query,  respond 
to  the  ATTRIBUTE>  prompt  with  a  <return>. 

ATTRIBUTE >    TEMP 

Enter  the  character  for  the  desired  relational  operator 

(a)  =   EQUAL 

(b)  =  NOT  EQUAL 

(c)  >   GREATER  THAN 

(d)  >=  GREATER  THAN  or  EQUAL 

(e)  <   LESS  THAN 

(f)  <=  LESS  THAN  or  EQUAL 

Letter >    a 

Value  >    Ship 

So  far  vouf  conjunction  is 

(TEMP=Ship). 

Do  you  wish  to   and*  additional  predicates  to  this  conjunction?  (y/n) 

n 

Do  you  wish  to  append  more  conjunctions  to  the  query?  (y/n) 

n 

Begin  entering  attributes  for  the  Target-List.    When  you  are 

through  entering  attributes  respond  to  the  ATTRIBUTE>  prompt  with  <return>. 

Do  you  wish  to  be  prompted  for  aggregation? 

n 

ATTRIBUTE^   SNO 

ATTRIBUTE  >    PNO 

ATTRIBUTE>    QTY 
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The  request  being  processed  is: 
RETRIEVE(TEMP=Sups)(SNO,NAME) 
COMMON(SNO.SNO) 
RETRIEVE(TEMP=Ship)(SNO.PNO,QTY)"i 


<COMMON.F 
<COMMON,F 
<COMMON,F 
<COMMON,F 
<COMMON.F 
<COMMON,F 
<COMMON,F 
<COMMON,F 
<COMMON,F 
<COMMON,F 
<COMMON,F 
<COMMON,F 
<COMMON,F 
<  COMMON. F 
<COMMON,F 
<COMMON,F 
<COMMON,F 
<COMMON.F 


Select  Options 


le> 
le> 
le> 
le> 
le> 
le> 
le> 
le> 
le> 
le> 
le> 
le> 
le> 
le> 
le> 
le> 
le> 
le> 


<SNO,Sl> 
<SNO,Sl> 
<SNO.Sl> 
<SNO.Sl> 
<SNO.Sl> 
<SNO.S3> 
<SNO,Sl> 
<SNO,S2> 
<SNO,S3> 
<SNO,S3> 
<SNO.S2> 
<SNO.S2> 
<SNO,S3> 
<SNO.S4> 
<SNO,S2> 
<SNO,S4> 
<SNO,S4> 
<SNO,S4> 


<NAME,Dec> 
<NAME.Dec> 
<NAME.Nut> 
<NAME,Nut> 

<  NAME. Deo 
<NAME.Nut> 
<NAME.Nut> 
<NAME,Nut> 
<NAME.Nut> 
<NAME,Dec> 
<NAME.Nut> 
<NAME.Nut> 
<NAME.Dec> 
<NAME.Nut> 
<NAME.Nut> 
<NAME.Nut> 
<NAME.Dec> 

<  NAME.  Deo 


<PN0,P2> 
<PNO,P2> 
<PNO,P2> 
<PN0,P2> 
<PN0.P2> 
<PNO.Pl> 
<PNO.P2> 
<PNO,P2> 
<PN0.P1> 
<PNO,Pl> 
<PN0.P2> 
<PN0,P2> 
<PNO,Pl> 
<PNO,P2> 
<PNO,P2> 
<PNO.P2> 
<PNO,P2> 
<PNO,P2> 


<SNO,Sl> 
<SN0,S1> 
<SNO.Sl> 
<SN0.S1> 
<SN0,S1> 
<SN0.S3> 
<SN0,S1> 
<SNO,S2> 
<SN0,S3> 
<SN0,S3> 
<SN0,S2> 
<SN0,S2> 
<SN0,S3> 
<SN0,S4> 
<SN0,S2> 
<SNO,S4> 
<SN0,S4> 
<SN0,S4> 


<QTY,2000> 

<QTY.1000> 

<QTY.500> 

<QTY.2000> 

<QTY,500> 

<QTY.2000> 

<QTY.1000> 

<QTY.500> 

<QTY.500> 

<QTY.2000> 

<QTY.1000> 

<QTY,500> 

<QTY,500> 

<QTY.2000> 

<QTY,1000> 

<QTY.1000> 

<QTY,2000> 

<QTY,1000> 


(d)    display  the  traffic  units  in  the  list 
(n)    enter  a  new  traffic  unit  to  be  executed 
(num)      execute  the  traffic  unit  at   num! 
(x)    exit,  from  this  SELECT  subsession 

Option  >    x 

Enter  the  type  of  subsession  you  want 

(r)    REDIRECT  OUTPUT;  select  output  for  answers 
(d)    NEW  DATABASE;  choose  a  new  database 
(n)    NEW  LIST:  create  a  new  list  of  traffic  units 
(m)  MODIFY;  modify  an  existing  list  of  traffic  units 
(s)    SELECT:  select  traffic  units  from  an  existing  list 

(or  give  new  traffic  units)  for  execution 
(o)    OLD  LIST:  execute  all  the  traffic  units  in  an 

existing  list 
(p)    PERFORMANCE  TESTING 
(x)    EXIT;  return  to  generate, load, execute,  or  exit  menu 

SELECTION>    x 
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What  operation  would  you  like  to  perform? 

(g)    -    generate  database 

(1)    -    load  database 

(e)    -    execute  test  interface 

(x)    -    exit  to  operating  system 

(z)    -   exit  and  Stop  MDBS 
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